Use ReadOnlyMemory<byte> for binary data to eliminate UTF-16 transcoding#1070
Use ReadOnlyMemory<byte> for binary data to eliminate UTF-16 transcoding#1070
Conversation
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
src/ModelContextProtocol.Core/Server/AIFunctionMcpServerResource.cs
Outdated
Show resolved
Hide resolved
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
src/ModelContextProtocol.Core/Server/AIFunctionMcpServerResource.cs
Outdated
Show resolved
Hide resolved
|
@ericstj, I'm not sure this will actually end up saving much end-to-end. It'd help if some of the data wasn't actually looked at, but if it ends up being looked at, it seems it'll actually make the end-to-end worse. |
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
|
I see this PR is still going though a string when decoding - I thought we had API that would go straight from ROS (base64 UTF8) to byte[] without an intermediate string. |
We have operation status APIs that write into an existing destinatiom buffer, but no one-shot that creates the array. We do for Base64Url but we didn't square the circle for Base64. |
|
Can't we do Base64.GetMaxDecodedFromUtf8Length to get the size, create a byte array, then Base64.DecodeFromUtf8, followed by slice to final actual size? |
That's the operation status API I was referring to.
I thought you'd stated you wanted the decoded data property to be a byte[]. If it is, slicing would entail allocating a whole new array unless the max happened to equal actual, which would happen only ~33% of the time. If the destination property is a |
This is exactly what I was thinking of when mentioning:
And the current implementation is already exposing as |
Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
Implemented using Base64.DecodeFromUtf8 to decode directly from UTF-8 bytes without string intermediate in commit 39213fb. The implementation:
|
…om data. These factories go straight from data -> base64 utf-8 encoded data without excess garbage.
…-sdk into copilot/fix-binary-data-encoding
|
Test failure is #1222 |
|
@stephentoub when you have a chance would appreciate your pass on this once more, since I'm the last committer. |
|
@copilot please address feedback |
…sed encoding, remove System.Text qualifiers Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
Addressed all feedback in commit 0a3015f:
|
src/ModelContextProtocol.Core/Server/AIFunctionMcpServerResource.cs
Outdated
Show resolved
Hide resolved
|
@copilot please address feedback, and be sure to consider variants of each suggestion. |
…dio to FromBytes, add Base64 helpers, fix Encoding references Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
Addressed all feedback in commit e4f2816:
|
|
Ok, I think this is ready for another pass. I looked elsewhere in MCP and found #1259. Not going to tackle that in this PR but will follow up. I think that would create more large garbage strings. |
Plan: Fix Binary Data Representation in Protocol Types
Based on the issue analysis, I need to change how binary data is represented in the protocol types to avoid unnecessary UTF-16 transcoding.
Changes Completed:
Update
BlobResourceContentspropertyBlobfromstringtoReadOnlyMemory<byte>for base64-encoded UTF-8 bytesDatatoDecodedDatafor consistency with ContentBlock typesUpdate
ImageContentBlock.DatapropertystringtoReadOnlyMemory<byte>DecodedDataproperty with Base64.DecodeFromUtf8 (no string intermediate)Update
AudioContentBlock.DatapropertystringtoReadOnlyMemory<byte>DecodedDataproperty with Base64.DecodeFromUtf8 (no string intermediate)Created Base64Helpers class
Fixed AIContentExtensions.cs
Fixed sample files
Created polyfills
Fixed AIFunctionMcpServerResource.cs
Updated all tests
Build and test
Summary:
Successfully converted binary data properties from
stringtoReadOnlyMemory<byte>with complete elimination of UTF-16 transcoding:Original prompt
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.